Abstract. It is known to many field biologists that
biosurveys of natural communities tend to produce a J-shaped
curve when the numbers of species are plotted against 
abundances. In other words, when the number of species of 
abundance k is plotted against k (running from 1 to some 
large number), the resulting distribution peaks at the lowest 
abundance, then forms a concave ramp as it approaches zero 
at the far end of the abundance axis. Does this distribution 
represent a single formula operating behind the scenes, or 
does it represent several formulas, appropriate for different 
types of community? Or does it represent no particular 
formula at all? 

The research reported here has three components: (1) the
analysis of a new dynamical system that simulates
multispecies communities (producing J-curves in the process)
and the derivation of the “logistic-J” distribution as the
underlying community equilibrium curve; (2) the summary
of a general theory of sampling as a bridge between natural
communities and samples of them; (3) the evaluation of
extant proposals for species-abundance distributions by
application of a general theory of sampling or by
cross-comparison via 100 biosurveys randomly selected from the
literature.

Introduction 

A glance at the species/abundance distribution for almost 
any community of organisms surveyed in the literature 
reveals a distinct tendency for the community-as-sampled to 
have more species at lower abundances than at higher ones. 
In fact the number of species per abundance tends to be
highest at the lowest abundance and thereafter to taper
somewhat in the manner of the empirical distribution shown
in Figure 1. In biological folklore (if not in the literature),
this curve is known as the “J-curve,” owing to its
resemblance to a backwards letter J.

To speak of “the J-curve,” however, begs a very large
question. Is there a single theoretical distribution that
underlies virtually all natural communities? Although it seems
almost too much to ask, this may well be the case. 

Prior distributions 

In spite of the fact that the J-curve is a commonplace
observation (Williams, 1964), one of the most popular
theoretical distributions, namely the lognormal distribution
(Preston, 1948), shows little resemblance to it. As shown in
Figure 2 (upper), the lognormal distribution is essentially a 
normal distribution that has been compressed at the low end 
and drawn out at the high end, both operations effected by 
a single logarithmic transformation. 

Conspicuously absent from the lognormal distribution is 
the sharp peak at the low abundance end. To save it from 
such a discrepancy, its proposer has postulated a “veil line”
(see Fig. 2), a vertical line of truncation that has the desired
effect, more or less. Preston argued that samples of a natural
community do not follow the same distribution as the
community itself: all the species below a certain abundance (the
veil line) simply fail to show up in samples.

This claim is fundamentally wrong (Dewdney, 1998). 
Indeed, species that fail to show up in a sample are veiled by 
a very different line, as shown in Figure 2 (lower). In this 
figure we use an unrealistic species/abundance distribution 
to illustrate the difference between the “veil line” and the
“veil curve.” Far from being a vertical, straight line, the true
veil curve is a sloping, sigmoidal one. The species above or
to the left of the veil curve will tend to be absent from the
sample. As proved in the recently developed general
mathematical theory of sampling (Dewdney, 1998), the removal
of these species cannot change the shape or formula of the
distribution, only its parameter values. It follows that when
we apply the veil curve to the lognormal distribution, we
must get a new, complete lognormal distribution, not a
truncated one.

Figure 1. A typical species-abundance distribution (McCabe and Weber, 1994).

The metastudy reported in this paper compares two
theoretical proposals for the distribution of species abundance
in natural settings. But it does not even consider the
lognormal distribution: if the lognormal were present in nature,
the first abundance categories would have to be smaller than
succeeding ones. Not one case of this has turned up in the
50 randomly selected biosurveys used in the metastudy.
This observation, coupled with the new sampling theory, 
means that the lognormal distribution, as a descriptor of 
abundances in natural settings, is effectively dead. 

A closely related distribution, the negative binomial
(Pielou, 1975), also uses the veil line concept and suffers
from the same unrealistic shape when unveiled, so to speak.
This distribution is therefore also not considered in the
metastudy and for the same reasons. It is no longer usable as
a descriptor of natural abundances. One may suspect that a
general theory of sampling was not developed until now
because of the confusion created by the mistaken concept of
the veil line.

Figure 2. Veil line and veil curve for the lognormal distribution.

The other leading contender for species/abundance
distribution of choice has been the log-series distribution
developed by C. B. Williams and R. A. Fisher (Fisher et al.,
1943). As shown in Figure 3 below, it has the right general
shape, being most sharply peaked at the low abundance end 
and tapering in concave fashion to zero. 

Williams originally believed that the curve ought to be
hyperbolic (Williams, 1964). Such a curve has the general
form of $1/k$, where $k$ represents abundance. But Fisher
pointed out that the area under the hyperbolic function was
infinite, hardly desirable in a statistical distribution! He
suggested altering the hyperbolic function by inserting a
convergent series that forced the area under the curve to
converge to a finite value. (There is no biological reason for
this alteration.) The probability density function (pdf) is
therefore

$$x^k / k,$$

where $k$ is an abundance and $x^k$ is the convergent series, $x$
being a parameter that is strictly less than 1 (but usually
close to it). When used in the field, the distribution contains
an additional factor $\alpha$ that reflects the number of individual
organisms in the sample. But $\alpha$ is not a parameter, and the
log-series is known as a one-parameter distribution. It has
been noted (May, 1975) that the log-series distribution has
points of superiority over the lognormal.

Theory: a new individual-based dynamical model 

Independently of concerns about the state of theoretical
abundance distributions, the author had constructed an
individual-based (Judson, 1994) dynamical system (Dewdney,
1997) that was originally intended as an exploratory
tool for probing the abundance distributions of heavily
predaceous communities such as stream benthic protists
(Dewdney, 1996). In this model, an arbitrary number of
species, each with an arbitrary population size, prey on
one another in the following manner: within each iteration,
two individuals (not species) are chosen at random. One
individual ingests the other, reproducing in consequence. The
model is called the multispecies logistic system, or MSL system,
for short. The adjective “logistic” was chosen because the
total biomass (number of individuals) remains fixed as a
simple consequence of the basic trophic act. Thus very
abundant species are less likely to ingest other species as
they approach the logistic limit.

The MSL system was embedded in a computer program,
written in Turbo Pascal and running on a 486 computer. One
hundred iterated pair selections (births and deaths) make up
one “cycle” (a programming convenience). After each
cycle, the program displays a histogram of species versus
abundances. It permits the user to select any number of
species, as well as initial abundances for each. The program
comes equipped with an “extinction switch,” essential to the
program’s usefulness. With the extinction switch “on,” any
species having abundance 1 will, when eaten, disappear
from the simulation. With the extinction switch “off,”
species with abundance 1 will not be eaten, although they can
(and do) end up preying on other species, with a possibility
of subsequent further increase.
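
The pair-selection rule and the extinction switch can be sketched in a few lines of Python. This is only an illustration and not the Turbo Pascal program itself: the names `run_msl` and `species_per_abundance`, the parameter defaults, and the decision to skip an iteration when the same individual is drawn twice are all assumptions made for the sketch.

```python
import random
from collections import Counter


def run_msl(n_species=200, init_abundance=20, cycles=1000,
            extinction=False, seed=0):
    """Sketch of the MSL system; returns the final abundance of each species."""
    rng = random.Random(seed)
    # One list entry per individual, holding its species label.
    individuals = [s for s in range(n_species) for _ in range(init_abundance)]
    counts = [init_abundance] * n_species
    total = len(individuals)

    for _ in range(cycles * 100):           # 100 pair selections per cycle
        predator = rng.randrange(total)
        prey = rng.randrange(total)
        if predator == prey:
            continue                         # assumption: such a draw is skipped
        pred_sp, prey_sp = individuals[predator], individuals[prey]
        if counts[prey_sp] == 1 and not extinction:
            continue                         # switch "off": last individual is safe
        # The prey dies and the predator reproduces, so the total is unchanged.
        individuals[prey] = pred_sp
        counts[prey_sp] -= 1
        counts[pred_sp] += 1
    return counts


def species_per_abundance(counts):
    """Histogram: abundance -> number of surviving species at that abundance."""
    return Counter(c for c in counts if c > 0)


if __name__ == "__main__":
    hist = species_per_abundance(
        run_msl(n_species=50, init_abundance=10, cycles=500))
    for abundance in sorted(hist):
        print(abundance, hist[abundance])
```

Storing one label per individual keeps the selection uniform over individuals rather than over species, which is the point stressed in the description of the model.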

With 200 species, each with initial abundance of 20, a 
486 computer takes about 10 minutes to drive the MSL 
system to equilibrium (extinction switch off). Initially, the 
species appear as a sharp spike at 20, then spread out into a 
binomial distribution with 20 as mean. However, the species
continue to drift in abundance until the shape of the
distribution changes radically. A peak forms at the low end, and
a long tail appears on the right.

Surprisingly, the process stops when the distribution
curve reaches what can only be described as a “J-shape,”
retaining roughly that shape for as long as the computer is
run. The higher the initial average abundance, the shorter
the initial spike. At very high abundances, there are only
occasional, small spikes at the lowest abundance.

The appearance of a J-curve invariably surprises those 
who would predict that a binomial (or normal) distribution 
must result or that, contrariwise, all but a few of the species 
will migrate to the low end. In the next section we show 
how a large metastudy of extant biosurveys has already 
begun to indicate that distributions produced by the MSL 
system cannot be distinguished statistically from typical 
biosurvey species abundance distributions. 

The MSL program is capable of calculating the average 
of the distributions it produces at each cycle. Figure 4 shows 
such an averaged distribution. The height of each bar in the 
histogram represents the average number of species that 
occupied the corresponding abundance category from the 
onset of equilibrium. Because the behavior of the MSL at 
the high abundance end has special interest, the frequencies 
have been inverted in Figure 4, appearing above the bars as 
a separate plot of isolated points. These will be examined 
presently. 



Figure 4. An average distribution produced by the multispecies logistic system, and a plot of inverted
average frequencies.
Most natural communities of interest do not have the low
average abundance used in the computer run for Figure 4.
Instead of in the tens, average abundances may easily run
into the thousands or even millions. In other words, the
overall shape of the distribution shown in Figure 4 would be
more typical of a sample than of a community. As
mentioned earlier, when the MSL system is run with such high
abundances, there is little or no peak at the low abundance
end. The distribution resembles instead the idealized pattern
shown in Figure 6, as explained later. Nevertheless, when
the extinction switch is turned on during such an
equilibrium state, species occasionally visit the low end, either to
escape again or to become extinct. The time between
successive extinctions grows at a modestly exponential rate.

Far from being unrealistic, this is precisely what we 
expect in natural communities. For example, the island 
biogeography theory of R. H. MacArthur and E. O. Wilson
(1967) recognizes that isolated communities naturally lose a 
certain percentage of species every year. They estimate that 
when the island of Krakatoa achieved equilibrium between 
immigrants and extinction losses in its bird community at 
roughly 27 species, the turnover rate was about 1.13% of 
species annually. 

Anticipating the discussion at the end of this article, we
may imagine for the moment that the MSL describes birds
as well as microorganisms. A community of 27 species of
“birds” with an average of 200 individuals per species
typically loses about 9 species during 1000 cycles of
operation. Each cycle involves 100 reproductive events. If half
the total bird population of 5400 (i.e., the females), namely 2700
individuals, reproduce in one year, then 27 cycles correspond
to the passage of one year. Thus 1000 cycles correspond to
37 years and a loss of 33.3% of the species over the period.
This corresponds to a rate of at least 0.9% of the species
every year. This figure is certainly close enough to the
MacArthur and Wilson (1967) figure to make the only claim
that is necessary in this context: the rate of species loss in
the MSL system appears to be of the right order of
magnitude. To explore equilibrium conditions, the MSL system
can be run with the extinction switch “off.”
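
The conversion from cycles to years is simple arithmetic, reproduced here as a short Python check. The variable names are illustrative only, and the loss of 9 species is the typical figure quoted in the text, not a new measurement.

```python
species = 27
mean_abundance = 200
species_lost = 9            # typical loss over the run, as quoted in the text
cycles_run = 1000
events_per_cycle = 100      # reproductive events in one cycle

total_individuals = species * mean_abundance              # 5400
breeders_per_year = total_individuals // 2                # 2700 (the females)
cycles_per_year = breeders_per_year / events_per_cycle    # 27 cycles = 1 year
years = cycles_run / cycles_per_year                      # about 37 years
fraction_lost = species_lost / species                    # one third
annual_loss_rate = fraction_lost / years                  # about 0.9% per year

print(f"{years:.1f} years, {100 * fraction_lost:.1f}% lost, "
      f"{100 * annual_loss_rate:.2f}% per year")
```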

Variations of the underlying MSL dynamical system
make little difference to the outcome. These have included
(1) changing the food web so that predation follows a cyclic
order, (2) redefining the food web to include four
compartments: plants, herbivores, carnivores, and saprobes, and (3)
running separate communities in which randomly selected
individuals may migrate from one “patch” to another. In all
cases the same J-curve apparently re-emerges. This robust
character of the J-curve seems to indicate a phenomenon
more fundamental than predation or other trophic behavior
at work. In fact, the essential feature of the MSL is that each
species vibrates stochastically, in effect. In other words,
each species in the system performs a constrained random
walk in the sense that (a) at each abundance each species
has an equal probability of decrease as increase, and (b) the
total abundance of all species remains constant. Typically,
species may be said to be in a stochastic orbit about the
mean abundance, with a majority having less than the mean
abundance at any time.

Assuming only this fundamental property and assuming
for the moment an infinite number of species, it is easy to
prove that, at equilibrium,

$$k \cdot f(k) = (k + 1) \cdot f(k + 1).$$

In other words, at equilibrium the probability of a species of
abundance $k$ increasing equals the probability of a species of
abundance $k + 1$ decreasing. This equation has essentially
only one solution, namely $f(k) = 1/k$.

The foregoing analysis paves the way for the finite case.
Since the number $N$ of individuals in the MSL system
remains constant during a run, no species can ever have
abundance greater than $N - R + 1$, where $R$ is the number of
species (each retaining at least one individual). At equilibrium, in the
ideal sense of this analysis, there must be a number $A$ at
(and beyond) which $f(k) = 0$. The number $A$ may well be
less than the absolute limit just cited. We assume (but
cannot prove directly) that the function $f$ is driven to zero in
the following manner,

$$p(k) \cdot k \cdot f(k) = p(k + 1) \cdot (k + 1) \cdot f(k + 1).$$

Here we have postulated what dynamicists call a “forcing
function,” which acts to drive the values of $f$ to zero at the
limiting $k$-value of $A$. This equation can also be solved
readily:

$$f(k) = 1 / (k \cdot p(k)).$$

If $f(A) = 0$ then there is an obvious singularity in $p$ at $k = A$. The
simplest function capable of such behavior is

$$p(k) = (A - k)^{-1},$$

and the function $f$ can therefore be rewritten,

$$f(k) = (A - k)/k$$

or, equivalently,

$$f(k) = (1 - \delta k)/k,$$

where $\delta = 1/A$. Anticipating the addition of a normalizing
constant presently, we can multiply $f$ by any constant we
like in the process of developing a convenient mathematical
expression for the density function.
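
As a sanity check on the algebra, the following Python sketch verifies with exact rational arithmetic that $f(k) = 1/k$ satisfies the infinite-case balance equation and that $f(k) = (A - k)/k$, with forcing function $p(k) = (A - k)^{-1}$, satisfies the forced one. The value A = 50 and the function names are arbitrary choices made for the illustration.

```python
from fractions import Fraction

A = 50  # arbitrary outer limit chosen for the check


def f_infinite(k):
    return Fraction(1, k)


def p(k):
    return Fraction(1, A - k)           # forcing function, singular at k = A


def f_finite(k):
    return Fraction(A - k, k)


# Infinite case: k*f(k) == (k+1)*f(k+1)
infinite_ok = all(k * f_infinite(k) == (k + 1) * f_infinite(k + 1)
                  for k in range(1, 1000))

# Finite case: p(k)*k*f(k) == p(k+1)*(k+1)*f(k+1), for k + 1 < A
finite_ok = all(p(k) * k * f_finite(k) == p(k + 1) * (k + 1) * f_finite(k + 1)
                for k in range(1, A - 1))

print(infinite_ok, finite_ok)
```

Both products reduce to the constant 1, which is why multiplying $f$ by any constant leaves the balance intact.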

The logistic-J distribution 

The logistic-J distribution (discrete version) has the
following pdf:

$$f(k) = c(1/k - \delta), \quad k = 1 \text{ to } A,$$

where the abundance $k$ runs from 1 to a maximum $A$ called
the outer limit and $\delta$ is the inverse of $A$. The outer limit
is not a hard limit, but an average maximum, as will be
made clear later. This particular pdf has one parameter, $A$,
the constant $c$ being simply a function of $A$. When the MSL
system reaches equilibrium, one finds a species with
abundance greater than $A$ about half the time.

In a more general setting, where the distribution is to
apply equally to real communities and samples of them, it is
useful to have the logistic-J distribution in continuous form:

$$f(x) = \begin{cases} c(1/x - \delta), & \epsilon < x < A, \\ 0, & \text{elsewhere.} \end{cases}$$

In this form an additional parameter, $\epsilon$, appears. Called the
inner limit, it represents the average lower limit of
abundances in a community or as reflected in a sample of that
community. For example, abundances in a sample may be
given as density data wherein the lowest abundance might
be 0.25. Or sample abundances may start at 5, say.
Superficially, the use of $\epsilon$ resembles a veil line, but it has
nothing to do with sampling. Instead, it represents the
average minimum abundance in the community of
organisms per se.

The constant $c$ is simply shorthand for the standard
normalizing constant for pdfs. In this case,

$$c = (\ln(A/\epsilon) - 1)^{-1}.$$

The pdf $f$ is defined to be zero outside of the interval $(\epsilon, A)$.
As a mathematical convenience, we adopt the notation
$L(\epsilon, \delta)$ for the logistic-J distribution with parameters $\epsilon$
and $\delta$.
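
The continuous pdf and its normalizing constant can be checked numerically with a short Python sketch. Integrating $c(1/x - \delta)$ directly over $(\epsilon, A)$ gives $\ln(A/\epsilon) - 1 + \epsilon\delta$, so the constant quoted above appears to neglect the term $\epsilon\delta$, which is small whenever $\epsilon$ is much less than $A$. The function names, the `exact` flag, and the test values $\epsilon = 0.3$, $A = 250$ are assumptions made for the illustration.

```python
import math


def normalizing_constant(eps, A, exact=True):
    """c for the continuous logistic-J pdf on (eps, A)."""
    area = math.log(A / eps) - 1.0
    if exact:
        area += eps / A     # the term eps*delta from direct integration
    return 1.0 / area


def logistic_j_pdf(x, eps, A, exact=True):
    if not (eps < x < A):
        return 0.0
    return normalizing_constant(eps, A, exact) * (1.0 / x - 1.0 / A)


def total_probability(eps, A, exact=True, steps=20000):
    """Midpoint rule in u = ln(x), where the integrand is smooth."""
    lo, hi = math.log(eps), math.log(A)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = math.exp(lo + (i + 0.5) * h)
        total += logistic_j_pdf(x, eps, A, exact) * x * h
    return total


if __name__ == "__main__":
    print(total_probability(0.3, 250.0, exact=True))    # with the eps*delta term
    print(total_probability(0.3, 250.0, exact=False))   # constant as quoted
```

For these values the two constants differ by only a few hundredths of a percent, so the shorter form is a reasonable working approximation.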

In the case of the frequencies generated by the MSL
system, as shown in Figure 4, the appropriate logistic-J
distribution $f$ has been calculated. To demonstrate that the
distribution of frequencies produced by the MSL system
does indeed appear to follow the logistic-J distribution, we
have inverted the theoretical values, plotting them as a
smooth curve, for comparison with the (inverted)
model-generated values. It will be seen that the agreement is as
close as can be expected, bearing in mind that the smallest
statistical fluctuations at the high abundance end will, when
inverted, produce relatively large fluctuations in the (point)
plot shown in the figure. The overall trend is clear: the
distribution produced by the MSL appears to be logistic-J.

The parameters $\epsilon$ and $\delta$ define a logistic-J distribution
completely. A useful visualization of the significance of
these parameters is presented in Figure 5, which shows a
standard hyperbola ($f(x) = 1/x$) in relation to axes rendered
in thick lines. Two other axis systems, rendered in lighter
lines, are superimposed on the figure. In the first axis system
the hyperbola has the formula $1/x$, and in the other axis
systems it has logistic-J formulas, to be explained presently.
The logistic-J distribution corresponds to a section of the
standard hyperbola, the origin of the section being
determined by the parameters $\epsilon$ and $\delta$.

According to the sampling theory derived by the author
(Dewdney, 1998), it is possible to draw a direct relationship
between a sample distribution and the distribution
prevailing in the community from which the sample was drawn.
The theory applies to all candidate distributions, including
the logistic-J, where it has a particularly simple form.
Suppose a field biologist samples a community of organisms
with intensity $r$, that is, observes/collects $100r\%$ of the
individuals in each species (to within the usual statistical
fluctuations). If the distributions are logistic-J and the
biologist finds the sample abundances following $L(\epsilon, \delta)$,
then the community abundances follow the distribution
$L(\epsilon/r, r\delta)$. Thus if $r = 0.05$ and the biologist finds
$L(0.3, 0.004)$, then he or she may reasonably estimate the
community as following the distribution $L(6.0, 0.0002)$
(in which the outer limit is therefore 5000).

Figure 5. Logistic-J probability density functions based on the standard hyperbola.
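
The sample-to-community relationship amounts to dividing the inner limit by the sampling intensity and multiplying $\delta$ by it. A minimal Python sketch, using a hypothetical helper `community_parameters` and the numbers of the worked example, is:

```python
def community_parameters(eps_sample, delta_sample, r):
    """Map sample parameters L(eps, delta) to community parameters L(eps/r, r*delta)."""
    if not 0 < r <= 1:
        raise ValueError("sampling intensity r must lie in (0, 1]")
    return eps_sample / r, r * delta_sample


eps_c, delta_c = community_parameters(0.3, 0.004, r=0.05)
outer_limit = 1.0 / delta_c     # the outer limit is the inverse of delta

print(round(eps_c, 6), round(delta_c, 6), round(outer_limit, 3))
```

The relation is an estimate that holds only to within the usual statistical fluctuations of sampling.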

The sample distribution may be thought of as the
left-hand hyperbolic section in Figure 5, while the com¬ 
munity distribution may be thought of as the one on the 
right. In this context, however, continuous figures are a 
little misleading. Since the community distribution is 
actually discrete and the abundances are normally much 
larger than those in a corresponding sample, we may 
represent the distribution of abundances in a community 
somewhat in the manner of the idealized diagram in 
Figure 6, in which individual species appear as small 
squares separated by spaces that increase in a modestly 
exponential manner from left to right. The very modest 
peak in the right-hand logistic-J distribution of Figure 5 
would correspond to the relatively small space between 
the first two species in Figure 6. 

The actual spacings shown above derive from the
logistic-J distribution and are based on the hypothesis that
communities actually follow the logistic-J distribution. As such,
the actual distribution would hardly appear so nicely
arranged. Clumps and gaps would abound, just as they do in
the MSL system when it is operated with parameters that
correspond to communities rather than samples of
communities. However, even if they follow some other kind of
distribution, it must be a J-curve (according to Dewdney, 
1998) and the actual distribution would not be noticeably 
different from a perturbed version of the one shown in 
Figure 6. 

The biosurvey metastudy

For the past few years the author has been gathering
abundance surveys from the literature. Called “biosurveys”
here, they cover four kingdoms of life (there being
apparently few biosurveys of Bacteria or Archaea, if
any). They were taken in polar, boreal, temperate, and
tropical biomes of every type: terrestrial, freshwater, and
marine. The intention is to include, ultimately, 100
“randomly selected” biosurveys in the study. Biosurveys
selected for the study have three criteria to fulfill: They
must (a) include at least 30 species, (b) not exclude
uncommon or low abundance species, (c) not use
numbers that are anecdotal or order-of-magnitude figures. So
far, out of about 70 biosurveys selected at random, 50
have passed these criteria (and these criteria alone), to be
included in the study.

Ideally, a “random selection” of biosurveys would
require (a) a list of all the biosurveys ever taken and (b)
a random number generator to select items from the list.
Unfortunately, no such grand list exists. However, in the
context of the metastudy reported here, “random” only
needs to mean “not prejudicing the outcome.” In other
words, it makes no difference how biosurveys are
selected from the literature, provided that nothing in the
selection process tends to turn up surveys that favor the
logistic-J distribution in some way. It is impossible, in
any case, even by visual examination of abundance data
in a typical biosurvey, to decide whether it would fit one
distribution better than another. Even so, the research 
assistants who did most of the selecting were instructed 
to scan through Biological Abstracts and other databases 
and to note every biosurvey encountered. If the journal 
happened to be in our library, the assistant then went to 
the article in question and, if the total number of species 
was 30 or more, made a copy of the paper. The author 
then applied the remaining criteria, rejecting papers that 
failed to meet conditions (b) or (c), as stated in the 
previous paragraph, but accepting all others without
prejudice.

Occasionally, in this process, the author would note that 
one kingdom of life or another was under-represented. The 
assistants were occasionally instructed to find more
biosurveys on fungi, protists, plants, or what have you. Besides the
papers selected by assistants, the metastudy includes a
biosurvey by the author (#1) and two papers by biological
colleagues (#19, 20) who had heard of the study and
volunteered their findings without knowing exactly what I
expected to find.

At the present point, halfway through the study, the
emphasis has been very much on the animal kingdom:

Fungi/Lichens    6
Protista         3
Plantae          5
Animalia        36

Figure 6. Idealized distribution of species in a large community.

The Animalia set includes 10 fish surveys, 7 of birds, 1 of
herptiles, 10 of insects, 2 of crustaceans, 1 of molluscs, 4 of
“invertebrates,” and 1 of “macrofauna.” The Plantae set has
3 herbaceous plant surveys, 1 of trees, and 1 of mosses. In
various combinations, the surveys were conducted in 27
temperate, 16 tropical/subtropical, and 7 boreal/polar
locations. General habitat types included 27 terrestrial, 11
freshwater, and 12 marine.

For each biosurvey selected for the study, a
species-abundance histogram was created, as outlined in the
foregoing section. Some biosurveys gave raw counts for a
specific community of organisms, others gave density data.
In all cases, the continuous version of the logistic-J
distribution was used, illustrating its great flexibility. The log-series
was also applied to both kinds of survey data, according to
the method outlined in Magurran (1988). Its normal range
had to be extended (in a mathematically defensible way) to
handle density and percentage data, however. This
extension did not detract from its ability to fit natural
communities.

The chi-square test (Hays and Winkler, 1971) is normally 
used in goodness-of-fit applications to produce a statistic 
that describes how closely the theoretical distribution fits 
empirical data. In this study, however, the test was used in 
comparative mode, a legitimate practice that involved a 
direct comparison of chi-square scores to determine which 
theoretical distribution best fit the 50 biosurveys overall. 

Figure 7 shows a portion of two consecutive lines from a 
chi-square table. When the statistic has been computed for 
both the logistic-J and the log-series distributions in relation 
to a specific biosurvey, it might well be that one statistic had 
5 degrees of freedom and the other had 6. 

In the present context, the degrees of freedom to apply in 
a given instance of the chi-square test is determined by the 
number of abundance categories into which the data has 
been divided minus the number of independent parameters 
in the distribution. Since the log-series distribution has just 
one parameter, whereas the logistic-J has two, the log-series 
distribution was typically tested at one higher degree of 
freedom, an advantage that exactly compensates for the
reduced descriptive power that accompanies fewer
parameters.

Suppose, for example, that a particular biosurvey matches
the logistic-J and the log-series with exactly the same
chi-square value, say 5.321. The P value along the top of the
table is simply the cutoff probability beyond which the fit
would be rejected in normal applications. For example, a
chi-square value of 5.321 at 5 degrees of freedom is less
than 6.62568, and this means that the fit must be “accepted”
at the 0.75 level.

In fact, as normally used, the chi-square test, like all
goodness-of-fit tests, works best as a rejector of fits. If the
chi-square statistic were greater than 6.62568, then the fit could
be rejected at the 0.75 level, meaning that one could reject
the fit and be 75% positive that no mistake was made in the
rejection. However, the test is not symmetrical in relation to
“acceptance” and rejection. If accepted, we can say only
that the fit was not rejected. Acceptance amounts to nothing 
like a proof that the accepted theoretical distribution is the 
actual underlying source of variation. Nor could it. There is 
an infinity of distribution functions that could be cooked up, 
all of them quite different from each other, all of them fitting 
the empirical data equally well. 

Although we will not be using the chi-square test in 
rejection/acceptance mode, the foregoing introduction 
serves to introduce the P values that are crucial to the 
metastudy reported here. 

Returning to the example where both the logistic-J and 
log-series happen, by coincidence, to have exactly the same 
chi-square score of 5.321, we can work out the
corresponding P values by a simple process called linear interpolation.
The P value gives us a direct comparison between the two
scores. Thus at 5 degrees of freedom the chi-square value of
5.321 corresponds to a P value of 0.607, while at 6 degrees
of freedom, the chi-square value of 5.321 corresponds to a
P value of 0.497. In this case then, the log-series chi-square
score (0.497) would be superior to the logistic-J score 
(0.607). 

This example not only illustrates how more degrees of
freedom translate, other things being equal, into a lower P
value, but how the P values themselves make it possible to
translate between chi-square scores at different degrees of
freedom. Since the mapping between chi-square scores and
their corresponding P values is 1-1, it is reversible. In other
words, we can start with a chi-square score at one number of
degrees of freedom, map that score into a corresponding P value, then
turn around and map the P value into a chi-square score at
some other number of degrees of freedom, it being guaranteed (by the
definition of the P value) that the scores will be comparable.
The chi-square distribution with 10 degrees of freedom was
selected as the “currency” of choice, 10 being an
intermediate value over all the degrees of freedom that actually
occurred in the metastudy. Each chi-square score, whether
for the log-series or for the logistic-J distribution, was
normalized in this fashion into the corresponding chi-square
score at 10 degrees of freedom.

Figure 7. Two lines from a chi-square table.
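
The normalization can be sketched in Python using only the standard library. The sketch reads the P value as the cumulative (lower-tail) probability, which is what the worked example implies (6.62568 at 5 degrees of freedom corresponds to 0.75). It computes P exactly from a series for the incomplete gamma function instead of interpolating in a table, so its values differ slightly from the interpolated ones quoted above. The function names are assumptions of the sketch, not those of the author's program.

```python
import math


def chi2_cdf(x, df):
    """Lower-tail probability of a chi-square variable (incomplete gamma series)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_ppf(p, df):
    """Inverse of chi2_cdf by bisection; the mapping is 1-1, so it is well defined."""
    lo, hi = 0.0, df + 50.0 * math.sqrt(2.0 * df) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def normalize_score(score, df, target_df=10):
    """Map a chi-square score at df into the equivalent score at target_df."""
    return chi2_ppf(chi2_cdf(score, df), target_df)


if __name__ == "__main__":
    # The same raw score of 5.321 at 5 and at 6 degrees of freedom.
    print(normalize_score(5.321, 5), normalize_score(5.321, 6))
```

Because the score at 5 degrees of freedom has the higher P value, it maps to the larger normalized score, in line with the comparison made in the example.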

Appendix Table 1 displays the raw chi-square score and
the corresponding normalized scores (at 10 degrees of
freedom) for both the logistic-J and the log-series distributions,
as applied to each biosurvey used so far in the metastudy. 
All chi-square scores were calculated by a program written 
in Turbo Pascal by the author and run for both sets of data. 
The right-hand column of the table displays the difference 
between the normalized scores. 

The average normalized score for the logistic-J
distribution over all 50 biosurveys was 10.653, while the average
score for the log-series distribution was 12.949. The latter
score is significantly higher, as revealed by a paired sample
interval estimate (Wonnacott and Wonnacott, 1982). In this
technique the paired differences are subjected to a means
test specialized for paired data such as we treat in Figure 8.
A confidence interval based on these data yields an average
difference in normalized scores of 2.296 ± 1.547, which
may be interpreted as follows: the probability that the two
means differ by less than 2.296 - 1.547 = 0.749 is 5%.
Indeed, a more conservative interval at 99% confidence of 2.296 ± 2.063
can also be constructed. Here, with probability of only 1%,
the two means differ by less than 2.296 - 2.063 =
0.233. The difference, though small, is apparently real: With
99% probability, the mean chi-square score for the logistic-J
distribution is lower than the mean chi-square
score for the log-series distribution on the same data. The
logistic-J distribution outperforms the log-series distribution
in this sense.
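
The two intervals can be cross-checked against each other without the raw paired differences. The Python sketch below assumes two-sided Student t critical values for 49 degrees of freedom (2.0096 at 95% and 2.6800 at 99%, standard table values supplied here as an assumption) and shows that the reported 95% half-width implies the reported 99% half-width. The helper `paired_interval` illustrates the general calculation for any list of paired differences.

```python
import math
import statistics


def paired_interval(diffs, t_crit):
    """Mean paired difference and the half-width t * s / sqrt(n)."""
    n = len(diffs)
    half_width = t_crit * statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs), half_width


# Two-sided Student t critical values for 49 degrees of freedom
# (standard table values, an assumption of this sketch).
T_95, T_99 = 2.0096, 2.6800

mean_diff = 12.949 - 10.653          # difference of the two average scores
std_error = 1.547 / T_95             # implied by the reported 95% half-width
half_width_99 = std_error * T_99     # compare with the reported 99% half-width

print(round(mean_diff, 3), round(half_width_99, 3))
```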

Not only are the test score means apparently different, but 
the average score of the logistic-J distribution also appears 
to be optimal or near-optimal, and in two ways. 

The mean of the chi-square distribution with $n$ degrees of
freedom is exactly $n$, and the variance is $2n$ (Hays and
Winkler, 1971). Thus the chi-square distribution at 10
degrees of freedom has a mean of 10.0. The average
normalized chi-square score for the logistic-J distribution, 10.653,
is obviously not far from optimal, whereas the average
normalized chi-square score for the log-series distribution,
12.949, is further away.

Under the null hypothesis, a distribution that was the 
actual source of variation in the biosurvey data would tend 
to have a score of around 10. On the other hand, the rather 
high variance, 20.0 in this case, serves as a warning not to 
take the closeness too seriously. Even if the logistic-J had 
achieved an average normalized score of 10.0, it could 
easily have been as much as one standard deviation (4.47) 
away from the optimal score, and in either direction. 
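
The figures quoted here (mean 10, variance 20, standard deviation 4.47) can be checked numerically. The following Python sketch, which is not part of the original study, builds chi-square variates as sums of squared standard normal deviates; the sample size and the random seed are arbitrary choices made for illustration.

```python
import math
import random
import statistics

DF = 10
theoretical_sd = math.sqrt(2 * DF)  # square root of the variance 2n

def chi_square_variate(df, rng):
    """One chi-square variate: the sum of df squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(1)  # fixed seed, chosen arbitrarily
sample = [chi_square_variate(DF, rng) for _ in range(20000)]
sample_mean = statistics.mean(sample)
sample_variance = statistics.variance(sample)
print(round(theoretical_sd, 2), round(sample_mean, 2), round(sample_variance, 2))
```

The simulated mean and variance should fall close to 10 and 20 respectively, with some scatter from run to run if the seed is changed.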

The median of the chi-square distribution for a given 
number of degrees of freedom is the score that corresponds 
to a P value of 0.500. Under the null hypothesis for the 
chi-square distribution with 10 degrees of freedom, we 
would expect half the scores to be less than 9.342. As it 
happens, some 23 of the normalized logistic-J scores have 
this property, whereas only 17 of the normalized log-series
scores are less than 9.342. 
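
The median quoted above can be recovered without tables. For an even number of degrees of freedom, 2m, the chi-square distribution function has the closed form 1 - exp(-x/2) * [sum over j < m of (x/2)^j / j!], and a bisection search then locates the score at which it equals 0.500. The Python sketch below is an illustration only; the function names and the short list of scores are hypothetical and are not taken from Appendix Table 1.

```python
import math

def chi_square_cdf_even(x, df):
    """Chi-square distribution function for an even number of d.f."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2.0
    tail = math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(df // 2)
    )
    return 1.0 - tail

def chi_square_median_even(df, lo=0.0, hi=100.0):
    """Bisection search for the score whose P value is 0.500."""
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if chi_square_cdf_even(mid, df) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

median_10 = chi_square_median_even(10)

# Hypothetical normalized scores; count how many fall below the median.
scores = [4.2, 7.9, 9.1, 9.6, 11.3, 15.8]
below = sum(1 for s in scores if s < median_10)
```

Applied to the 50 normalized scores of each distribution, the same count would be compared with the expected value of 25.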

Taken together with the near-certain superiority of the 
logistic-J distribution over the log-series, this evidence may 
be interpreted as reasonably strong support for the hypothesis
that abundances in natural communities follow the
logistic-J distribution. 

A final result is worth reporting. The version of the 
logistic-J distribution that appears in this study used an 
estimate for the parameter A based on the mean and lowest 
category frequency of the empirical distribution. The result¬ 
ing value of A may therefore be interpreted as a prediction 
of the maximum abundance for every biosurvey in the 
study. Since the predicted maximum abundance is only an 
average value, we would expect that if the predictions were 
accurate in this sense, the average value of the maximum 
abundances in the biosurveys would be fairly close to the 
average predicted values. 

To test this hypothesis, the ratio (percentage) of actual 
maximum abundance to that predicted by the logistic-J 
distribution was calculated for each biosurvey and the re¬ 
sults plotted as percentages, as in Figure 8. 

As it turns out, the average percentage ratio of maximum 
abundances is 99.1%. This means that the 50 predicted 
(average) maximum abundances behaved as would be ex¬ 
pected if the communities in question followed the logistic-J 
distribution. Although highly accurate, this result must 
again be interpreted with some caution, as the statistic is apt
to suffer from a high variance. Nevertheless, this result also 
supports the hypothesis, and from a quite different direction. 
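
The statistic just described is simply the mean, over biosurveys, of the observed maximum abundance expressed as a percentage of the predicted one. A minimal Python sketch follows; the function name and the three pairs of values are invented for illustration and are not drawn from the metastudy.

```python
import statistics

def mean_percentage_ratio(observed_max, predicted_max):
    """Mean of 100 * observed / predicted over all biosurveys."""
    ratios = [100.0 * obs / pred
              for obs, pred in zip(observed_max, predicted_max)]
    return statistics.mean(ratios)

# Hypothetical values: individual ratios of 50%, 100%, and 150%.
observed = [40, 120, 90]
predicted = [80.0, 120.0, 60.0]
average = mean_percentage_ratio(observed, predicted)
```

Individual ratios as far apart as 50% and 150% still average to 100% in this example, which illustrates why a mean close to 100% must be read with the caution urged above.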

Summary 

There are two hypotheses implicit in the foregoing. The 
first hypothesis is that all natural communities follow the 
logistic-J distribution. The second hypothesis hinges on 
what level of interpretation is applied to the MSL system 
itself. 

The first hypothesis has little meaning until the word
“community” is defined. We define a place as any connected
volume within the biosphere, a time as the period
between two clock/calendar readings, and the supercommunity
connected with this place and time as the set of all
living organisms within the space over the time in question.
A “community,” as we shall use the word, will be a subset
of the supercommunity. Although this definition is somewhat
too abstract to be very useful, we may restrict the meaning by
allowing as “subsets” only taxonomically related organisms
or those related by similar size or by being found in the
same habitat type or, in general, any sense of the word
habitually used in the field. Although the MSL system
models only supercommunities, a compartmentalized version
of the model reveals the same distribution obtaining
within compartments (e.g., herbivores).

The first hypothesis, that all natural communities follow
the logistic-J distribution, has been supported by four separate
outcomes of the metastudy:

First, the logistic-J distribution significantly outperforms 
the log-series distribution as a descriptor of abundances in 
communities. As already seen, the lognormal distribution, 
when truncated properly, has no resemblance at all to em¬ 
pirical data. With the two most commonly used distribu¬ 
tions thus eliminated, there remains no serious alternative to 
the logistic-J distribution.

Second, the logistic-J has a normalized chi-square score 
that exceeds the median about half the time. Not only does 
it outperform the log-series distribution in this respect, but 
its closeness to the expected number of such scores, namely 
25, might be interpreted as another hint of optimality. 

Third, the scores of the logistic-J distribution on the 
biosurveys considered as a whole reproduce the chi-square 
distribution itself. This can happen only if the null hypothesis
is always (or almost always) true of the biosurveys in
the study. The possibility remains that another theoretical 
distribution is the proper one, but it will so closely resemble 
the logistic-J as to be perpetually indistinguishable from it, 
given the results of the metastudy so far. 

Fourth, the average outer limit predicted by the logistic-J 
distribution matches the average maximum abundance of 
the biosurveys themselves. This would also be true if all the
biosurveys had the logistic-J as their underlying distribution.

The second hypothesis, concerning the mechanism underlying
the logistic-J distribution, necessarily involves reflection
on the MSL system itself. The MSL model has
three levels of interpretation that are mutually compatible,
but successively more general. As originally intended, it 
was to reflect the high levels of predation to be found in 
stream benthic micro-environments (Dewdney, 1997). 

At the next level of interpretation, individuals are not 
ingesting each other, but merely trading biomass for repro¬ 
ductive enabling. This view covers not only predation, but 
competition for sunlight (as when one plant shades out 
another, taking biomass that would, in effect, have been 
accumulated by the shaded plant), and saprobic activity of 
fungi and bacteria. Obviously, this view stretches the MSL
system considerably but, as we have already seen, the basic
model system is “detail hungry.” When altered to employ
fractional trophism or when modified to operate on the basis 
of definite food webs, it still produces J-curves. 

At the third level of interpretation, even the trophic ac¬ 
tivity is irrelevant. All that matters is that a given organism 
is as likely to reproduce as it is to die before reproducing. 
Although this may not be true over short periods of time for 
actual species, a certain long-term birth/death equiprobability
surely prevails for every species that has survived to the
present day. In other words, regardless of the individuals 
involved, a species has been as likely to increase, in the long 
run, as it was to decrease. The ratio, after all, is the number 
of successful reproductions divided by the number of 
deaths. 

The notions of a priori probabilities of death or reproduction
are not very useful, unfortunately. There is no way
to measure them and no way to predict the outcome even if 
they turned out to be equal. In direct contradiction to what 
any theorist (including the author) might have guessed, the 
seeming stability implied by equal probabilities of decline 
or increase is an illusion. Instead of a normal (or even a
lognormal) distribution, a J-curve invariably results. This 
would be just as true of any natural system obeying such an 
hypothesis as it is of the MSL system. Taken literally, the 
behavior of the MSL system would predict that most pop¬ 
ulations will appear to be regulated (Turchin, 1995), at least 
somewhat, by density over the short run, while appearing 
increasingly stochastic in the long run. 

We shall adopt a hypothesis that is nearly equivalent to 
this. At any time (seasonal and cyclic effects aside) and in 
any community, it is unpredictable whether the next change 
in the population of a given species will be an increase or a 
decrease. Indeed, we hypothesize that all species in all 
communities are continually undergoing what may be called 
“stochastic vibration." In the MSL system, species contin¬ 
ually orbit the mean population size. At any time, most have 
smaller populations, some have larger populations, and a 
few have much larger ones. In a purely random manner,
some very abundant species decline, ultimately to very
small numbers, while others increase dramatically and for
no apparent reason. Such an increase in a real community 
(that followed such a regimen) would usually have many 
causes that chanced to work together, or sometimes a single 
cause that happened to dominate all other factors. 
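
The qualitative behavior described here can be imitated by a toy simulation. The Python sketch below is not the MSL system as specified in this paper: the transfer rule (one individual moves from a randomly chosen donor species to a randomly chosen recipient), the floor of one individual per species, the community size, the number of steps, and the seed are all assumptions made for illustration. Away from the floor, each species is by construction as likely to gain an individual as to lose one.

```python
import random
from collections import Counter

def simulate_transfers(num_species, start_abundance, steps, seed):
    """Randomly move single individuals between species.

    Every species starts at the same abundance. At each step one
    individual passes from a random donor to a random recipient,
    unless the donor is down to its last individual (an assumed
    floor that keeps every species in the community).
    """
    rng = random.Random(seed)
    abundance = [start_abundance] * num_species
    for _ in range(steps):
        donor = rng.randrange(num_species)
        recipient = rng.randrange(num_species)
        if donor != recipient and abundance[donor] > 1:
            abundance[donor] -= 1
            abundance[recipient] += 1
    return abundance

final = simulate_transfers(num_species=500, start_abundance=10,
                           steps=500_000, seed=1)
mean_abundance = sum(final) / len(final)  # total is conserved, so 10.0
share_below_mean = sum(1 for a in final if a < mean_abundance) / len(final)
# Number of species at each abundance: the species-abundance histogram.
histogram = Counter(final)
```

Under these assumptions, most species should end below the mean abundance and a few well above it, even though all began at the mean. Whether the resulting histogram is a logistic-J curve is not something this sketch can establish.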

Such unpredictability does not amount to a claim of 
nondeterminism. Perhaps a reasonable analogy will be
found in the stock market. Although thousands of investment
decisions, each of them deterministic for the individuals
concerned, will play a role in the price of a stock over
the period of a week, no one can predict the eventual effect
of those decisions on the price. No one, after all, knows all
the investors and their buying patterns. It is reasonably well 
understood that stock prices are “random” in this sense 
(Malkiel, 1985). 

Although it would be very difficult to test, the stochastic 
vibration hypothesis has one important philosophical impli¬ 
cation. It amounts to a confession of ignorance about the 
normal causes of change in abundance of populations, the 
contributing factors being in most cases beyond observation 
or calculation. It does not, however, signal a state of despair. 
It merely injects a note of realism into any project that 
would ascribe changes in abundance to single factors. 

Consider an individual plant, for example. Upon germi¬ 
nation and up to reproductive maturity, it may be killed 
by too much sun or too little, by excess cold or heat, by 
foraging animals, by fungal or other pathogenic attack, by 
parasites, by trampling, by overshadowing, by root competition,
by excess dryness or humidity, by flooding, by environmental
toxins, and so on. Most of these events are
completely unpredictable, especially those driven by the 
weather—which, being chaotic, cannot be predicted with 
any certainty beyond a day or two. Over a season, some of 
the plants in a local community will succumb to one of these 
factors and die. 

To reproduce, a plant must first produce seeds. But the 
flower may not develop properly, the pollinating insect may 
not visit the plant, the pollinator may not be carrying the 
right pollen (in the case of cross-pollinating plants), and so
on. When ovaries are fertilized, the game gets even rougher.
That most plants produce rather large numbers of seeds
testifies to the fact that most seeds either do not germinate 
or die after germination. They may land on bad soil, be 
eaten by animal scavengers, be attacked by fungus, become 
desiccated, and so on.

The logistic-J distribution, including its underlying dynamical
system and the stochastic behavior of its species, is
here proposed as the major organizing factor present in all
natural communities of living things. As such, it would have 
rather important implications in a number of fields, not least 
biodiversity assessment and the theory of evolution. Two 
brief remarks may serve for the time being. 


There are many different definitions of biodiversity, no
two alike (Magurran, 1988). If it is ultimately concluded
that most communities of living organisms follow the logistic-J
distribution, then a new and uniform approach to the
problem of biodiversity assessment can be developed. One
may calculate the “biodiversity” of a community of organisms,
not as a single number (a hopeless project [Gaston,
1995]) but as a triplet, (R, e, A). These numbers would be
estimates of those parameters for the community as a whole, 
derived via samples that are subjected to the transformations 
outlined in Dewdney (1998). And the abundances in such 
communities can be largely reconstructed from these num¬ 
bers, although our theory says nothing about which species 
would have which abundances. 

In the theory of evolution, it might be asked whether the 
stochastic vibrations hypothesized here for species might 
also prevail at the generic and higher taxonomic levels. 
Williams (1964) observed that the J-curve also emerges if 
one plots genera against species, not in a community this 
time, but in standard taxonomic lists. For example, if one 
counts the number of bird genera that have 1 species, 2
species, and so on, a J-curve emerges. This can be done
within families or orders. It may be that genera “vibrate” in 
the sense that, through evolutionary time, they lose and gain 
species more or less at random (i.e., unpredictably and with 
no overall discernible pattern). 
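
The tabulation Williams describes is a count of counts: first the number of species in each genus, then the number of genera having each species total. A short Python sketch under assumed data follows; the function name, the genus labels, and the list itself are placeholders, not a real taxonomic list.

```python
from collections import Counter

def genera_per_species_count(genus_of_each_species):
    """Map k -> number of genera that contain exactly k species."""
    species_per_genus = Counter(genus_of_each_species)
    return Counter(species_per_genus.values())

# Placeholder list: one entry per species, giving that species' genus.
listing = ["genus_a", "genus_b", "genus_b", "genus_c", "genus_d",
           "genus_e", "genus_e", "genus_e", "genus_f"]
table = genera_per_species_count(listing)
for k in sorted(table):
    print(k, table[k])
```

In this placeholder list, four genera have a single species and one genus each has two and three, so the counts already fall away from the lowest category in the manner of a J-curve.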

Finally, the J-curve, whether one regards it as being 
logistic-J or not, tells us that within any community of 
organisms, especially somewhat isolated or patchy ones, 
there will be many species with relatively small popula¬ 
tions—far more than is commonly realized, even by many 
field biologists. Such populations will be more readily sub¬ 
ject to mutational change, since new genes have a much 
better chance of spreading through them. From this viewpoint,
the low abundance end of the J-curve may be identified
not only as the grave of evolution but as its cradle as
well.
Appendix

Table 1

Chi-square scores and corresponding normalized scores for the logistic-J
and log-series distributions on 50 biosurveys